Which does more to advance your academic level (or overall ability): the low-level AI framework, or the upper-level AI application? The question has sparked extensive discussion on Zhihu. With the authorization of Zhihu users Jie Junyuan and Fine Tuning, Xinzhiyuan has compiled their in-depth analyses and shares them with readers. For people doing AI-related work, choosing between the "low-level AI framework" in one hand and the "upper-level AI application" in the other can be a genuinely agonizing decision.

Stay focused, but understand both sides

As someone who moved from deep learning to systems, I have recently been reflecting on a question: is the core of a deep learning system (Deep Learning System) the deep learning, or the system?

The conclusion first: whether you work on deep learning or on deep learning systems, you need to understand both sides. You can focus on your own direction, but you must not be completely ignorant of the other side; otherwise it will be hard to produce results that are useful in practice.

First, look at the teams behind the popular frameworks and what drove their development:

- Caffe: developed by Yangqing Jia and the Berkeley Vision Lab, initially mainly for their own use. Demand-driven.
- Torch: students of Yann LeCun. Demand-driven.
- Theano: students of Yoshua Bengio, built for their own research, with a systems paper published along the way. Demand- plus research-driven.
- TensorFlow: Google engineers led by Jeff Dean, mostly from systems backgrounds. It grew out of Google's strategic push into AI. Capital-driven.
- Neon: Nervana employees; a startup's product.
Capital-driven.

- MXNet: the DMLC crew (mostly Chinese students working on machine learning and distributed systems), formed by merging the Minerva, Purine, and cxxnet teams; half work on machine learning and half on systems. Demand- plus interest-driven.

Many other frameworks were built by systems people out of interest or for research purposes, but most never caught on, so I will not list them. As you can see, apart from Google's heavily promoted TensorFlow, most frameworks started from self-use and interest. TensorFlow's development budget is dozens of times the funding of all the other frameworks combined, yet even after a year it has not come to dominate the field. That shows the power of demand: necessity is the mother of invention.

Why do the mainstream deep learning frameworks mostly come from "deep learning people who know some systems" rather than "systems people who know some deep learning"? The main reason, I think, is an essential difference between deep learning systems and traditional systems such as operating systems and databases: the parts of a deep learning algorithm are tightly coupled, and a change in one place ripples through the whole.

The systems mindset is: I build a system, define an interface, guarantee the interface is correct, and users can use it without knowing the implementation details. After all, you do not need to understand the file system format to use an operating system, or how consistency is achieved to use a database. But this mindset does not fit deep learning systems. First, a data matrix flows through the entire system, and the details of each step may affect the result a hundred steps later; for intermediate results, you cannot strictly define what "correct" means.
A good algorithm is not a simple stack of N good parts. Hinton once remarked that Dropout looks like a bug, yet it improves accuracy, so it is a "good bug". Second, deep learning algorithms are complex, with many factors that need to be controlled, so a fixed interface can hardly satisfy every user's needs. It is better to keep the system simple and flexible, so that users can easily modify it as they see fit.

Conversely, for people doing deep learning: if you do not understand the system's internal details, then when your algorithm works well you will not know which factors made it work. You might switch frameworks, see the results degrade, and find the cause is some implementation detail you knew nothing about; and when the results are poor, you will not know how to improve them. Likewise, when you need to implement a new algorithm, you will often find that the framework's existing interfaces cannot express it, and you will have to understand the internals in order to modify the system toward your goal.

The bottom layer is harder to build; the upper layer is closer to the ground

At a conference last week I met Prof. Xia Hu from TAMU, who introduced Auto-Keras, an open-source automated machine learning framework his group recently released. In his own words: "Building an open-source framework is a very meaningful thing, especially when your work gets attention and users within a short time; it is very fulfilling." Indeed, many people in the field are gradually turning their attention to lower-level, more "infrastructure"-like directions, such as automatic hyperparameter tuning, large-scale machine learning, and parallel machine learning. After all, for a good algorithm to reach more users, the barrier to entry must come down, and that requires a common framework.
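The Dropout anecdote earlier is worth making concrete. Below is a minimal sketch of inverted dropout in NumPy (my own illustration, not code from any particular framework): at training time activations are randomly zeroed, which is exactly the "bug-like" behavior, and the survivors are rescaled so the expected output matches the input.

```python
import numpy as np

def dropout(x, p=0.5, training=True, rng=None):
    """Inverted dropout: zero each unit with probability p at training
    time, and rescale survivors by 1/(1-p) so the expected value of
    the output matches the input."""
    if not training or p == 0.0:
        return x
    rng = rng if rng is not None else np.random.default_rng()
    mask = rng.random(x.shape) >= p   # True = keep, with probability 1 - p
    return x * mask / (1.0 - p)
```

At inference time (`training=False`) the function is the identity, which is why the rescaling during training matters: without it, train-time and test-time activations would have different scales.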
Without scikit-learn, I estimate the number of people doing machine learning would be cut at least in half; without TensorFlow or Torch, likely fewer than half as many people would be doing deep learning. Strictly speaking, going from proposing an algorithm, to packaging it, to applying it to real datasets is a pipeline, a chain of work from upstream to downstream.

One observation of mine is that many people doing algorithm research write very rough code whose efficiency can be very low. A simple example: to find the K nearest neighbors, you can rescan all points for every query, or you can first build a KD-tree to reduce the time complexity. From a purely logical standpoint both are correct, but their efficiency can differ enormously. This is why much cutting-edge research fails to land in practice: the code is unoptimized, or the implementation is riddled with bugs.

I think a very good angle of attack is to study how to implement classic and cutting-edge algorithms efficiently, from simple vectorization and parallel operations to more elaborate structural design and even large-scale parallel computing. A well-built underlying framework matters greatly for both industry and research:

- Industry can quickly try cutting-edge algorithms and verify their reliability and practicality on real data.
- The research community can compare cutting-edge algorithms fairly and guard against fraud. Many papers claim to beat the current state of the art (SOTA) by a wide margin, when in fact the authors may simply not have implemented the SOTA method correctly.

Since last year I have tried reinventing a few small wheels and building some small frameworks.
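The K-nearest-neighbor point above can be sketched in code, assuming NumPy and SciPy are available (function names are my own): both routines return the same neighbors, but the brute-force version rescans all n points per query, while the KD-tree, built once up front, answers each query in roughly O(log n) expected time in low dimensions.

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_brute(points, query, k):
    # rescan every point on every query: O(n) per query
    dists = np.linalg.norm(points - query, axis=1)
    return np.sort(np.argsort(dists)[:k])

def knn_kdtree(tree, query, k):
    # query a pre-built KD-tree: ~O(log n) expected per query
    # in low dimensions
    _, idx = tree.query(query, k=k)
    return np.sort(np.atleast_1d(idx))

rng = np.random.default_rng(0)
points = rng.random((10_000, 3))
tree = cKDTree(points)      # built once, amortized over all queries
query = rng.random(3)

# both are logically correct: they agree on the answer
assert np.array_equal(knn_brute(points, query, 5),
                      knn_kdtree(tree, query, 5))
```

In high dimensions KD-trees degrade back toward brute force, which is one reason approximate techniques such as cluster-based pruning become attractive.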
This process has brought many new realizations:

- Designing and implementing a framework makes it easier to find the shortcomings of the original algorithm, which helps inspire new ideas. Take a K-nearest-neighbor-based algorithm as an example: if during implementation you find that the whole program is bottlenecked on the K-nearest-neighbor step, you can try accelerating it with a KD-tree, or even replace the KNN step entirely and approximate it with clustering. Once you understand an algorithm's bottleneck, you can propose new and meaningful improvements that feed back into academic research.

- It strengthens your implementation skills and keeps research from sliding into armchair theorizing. The traditional classification method that has drawn the most attention in the past two years is Tianqi Chen's XGBoost, which is indeed very easy to use. I attribute much of XGBoost's success to the fact that the algorithm was packaged into a mature tool library very early, which rests on Chen's deep skill in system design and implementation. There must have been some very good algorithms in the past decade that faded simply because their authors could not package them into mature wheels for others to use, which is a pity.

- It better matches industry's expectations and helps in the job market. In most cases industry does not care how many impressive papers you have published; it cares whether you can meet the company's needs. In my experience, even at academic conferences few people are interested in my throwaway papers, but many ask about my experience developing frameworks, because they have not merely heard of the frameworks: they are users.

- A sense of accomplishment. A framework has far more users than a paper has readers.
When you find that a framework you designed is used by people all over the world, you feel a strong sense of accomplishment: you are contributing, bit by bit, to the development of the field, rather than just writing throwaway papers that no one will ever read.

The points above mainly address whether it is worth learning framework development and trying to build some new wheels. Back to the question: low-level framework or upper-level application, which is better? My view is that it depends on the skills you have.

The low-level framework: the difficulty lies in packaging and performance. For example, how to design the API, how to improve and optimize running speed, and how to write good tests that guarantee the methods are correct.

The upper-level application: the difficulty lies in applying existing wheels to real data, which involves many practical problems such as data cleaning and understanding how to call the underlying functions correctly.

Generally speaking, most people are not suited to writing the bottom layer. There are already many excellent frameworks, the demands on system architecture and code optimization are high, and most people lack the required background. Upper-level applications are closer to the ground and sharpen our sensitivity to data; students who excel at them also become offer harvesters in the job market. In fact, doing upper-level applications well is not easy either: it requires a deep understanding of the problem. In other words, the low-level framework and the upper-level application are different pieces of the pie, with different focuses.

From a research perspective, inventing an algorithm should not be the end of the story. The proposer of an algorithm should also implement the model, because even good wine fears a deep alley: good work still needs to be made usable to be noticed.
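As a concrete closing illustration of the "write a good test to ensure the method is correct" point: one common pattern is to test an optimized implementation against a naive reference implementation. The softmax example below is my own hypothetical sketch in NumPy, not taken from either answer.

```python
import numpy as np

def softmax_naive(x):
    # reference implementation: correct on well-behaved inputs,
    # but np.exp overflows for large x
    e = np.exp(x)
    return e / e.sum()

def softmax_stable(x):
    # "optimized" implementation: subtracting the max leaves the
    # result unchanged mathematically but avoids overflow
    e = np.exp(x - x.max())
    return e / e.sum()

# the test: the two must agree on ordinary inputs...
rng = np.random.default_rng(0)
for _ in range(100):
    x = rng.normal(size=10)
    assert np.allclose(softmax_naive(x), softmax_stable(x))

# ...and the stable version must survive inputs that break the naive one
assert np.isfinite(softmax_stable(np.array([1000.0, 0.0]))).all()
```

Testing against a slow-but-obviously-correct baseline on many random inputs is cheap to write and catches exactly the kind of subtle implementation bug that makes "SOTA comparisons" in papers unreliable.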